Concentration inequalities for order statistics
This note describes non-asymptotic variance and tail bounds for order
statistics of samples of independent identically distributed random variables.
These bounds are shown to be asymptotically tight when the sampling
distribution belongs to a maximum domain of attraction. If the sampling
distribution has non-decreasing hazard rate (this includes the Gaussian
distribution), we derive an exponential Efron-Stein inequality for order
statistics: an inequality connecting the logarithmic moment generating function
of centered order statistics with exponential moments of Efron-Stein
(jackknife) estimates of variance. We use this general connection to derive
variance and tail bounds for order statistics of a Gaussian sample. These bounds
are not within the scope of the Tsirelson-Ibragimov-Sudakov Gaussian
concentration inequality. The proofs are elementary and combine Rényi's
representation of order statistics with the so-called entropy approach
to concentration inequalities popularized by M. Ledoux.
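As a quick numerical illustration of the Rényi representation invoked above, the sketch below (Python; the helper name `renyi_order_stats` is ours) builds the order statistics of n i.i.d. Exp(1) variables as cumulative sums of rescaled exponential spacings, and checks that the sample mean of the maximum matches its known expectation, the harmonic number H_n:

```python
import random

def renyi_order_stats(n, rng):
    """Order statistics of n i.i.d. Exp(1) variables via Renyi's
    representation: X_(k) = sum_{j<=k} E_j / (n - j + 1), E_j i.i.d. Exp(1)."""
    xs = []
    s = 0.0
    for j in range(1, n + 1):
        s += rng.expovariate(1.0) / (n - j + 1)
        xs.append(s)
    return xs  # already sorted: X_(1) <= ... <= X_(n)

rng = random.Random(0)
n, reps = 10, 20000
mean_max = sum(renyi_order_stats(n, rng)[-1] for _ in range(reps)) / reps
harmonic = sum(1.0 / i for i in range(1, n + 1))  # E[X_(n)] = H_n
```

The construction yields all n order statistics in a single pass, which is what makes the representation convenient for the variance and tail bounds discussed above.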
About Adaptive Coding on Countable Alphabets: Max-Stable Envelope Classes
In this paper, we study the problem of lossless universal source coding for
stationary memoryless sources on countably infinite alphabets. This task is
generally not achievable without restricting the class of sources over which
universality is desired. Building on our prior work, we propose natural
families of sources characterized by a common dominating envelope. We
particularly emphasize the notion of adaptivity, which is the ability to
perform as well as an oracle knowing the envelope, without actually knowing it.
This is closely related to the notion of hierarchical universal source coding,
but with the important difference that families of envelope classes are not
discretely indexed and not necessarily nested.
Our contribution is to extend the classes of envelopes over which adaptive
universal source coding is possible, namely by including max-stable
(heavy-tailed) envelopes which are excellent models in many applications, such
as natural language modeling. We derive a minimax lower bound on the redundancy
of any code on such envelope classes, including an oracle that knows the
envelope. We then propose a constructive code that does not use knowledge of
the envelope. The code is computationally efficient and is structured to use an
Expanding Threshold for Auto-Censoring, and we therefore dub it the
ETAC-code. We prove that the ETAC-code achieves the lower
bound on the minimax redundancy within a factor logarithmic in the sequence
length, and can therefore be qualified as a near-adaptive code over families of
heavy-tailed envelopes. For finite and light-tailed envelopes the penalty is
even smaller, and the same code closely matches previous results that explicitly
made the light-tailed assumption. Our technical results are founded on methods
from regular variation theory and concentration of measure.
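The threshold-with-escape idea behind auto-censoring can be illustrated with a toy sketch (Python). This is purely illustrative and is not the actual ETAC construction: the helper names, the Elias gamma code for integers, and the linear threshold are all our assumptions. Symbols at or below a growing threshold are coded in place; larger ones emit an escape and are deferred to a side stream:

```python
def elias_gamma(n):
    """Elias gamma code for a positive integer n: a unary prefix giving
    the bit length, followed by the binary digits of n."""
    assert n >= 1
    b = bin(n)[2:]
    return "0" * (len(b) - 1) + b

def censored_encode(symbols, threshold):
    """Toy auto-censoring pass (illustrative, not the ETAC code):
    symbols up to threshold(i) are coded in place, shifted by one so
    that codeword 1 stays free as an escape; larger symbols emit the
    escape and their value goes to a side stream."""
    main, side = [], []
    for i, s in enumerate(symbols, start=1):
        if s <= threshold(i):
            main.append(elias_gamma(s + 1))
        else:
            main.append(elias_gamma(1))  # escape marker
            side.append(elias_gamma(s))
    return "".join(main), "".join(side)

main, side = censored_encode([3, 50, 2, 7], threshold=lambda i: 4 * i)
```

Because the threshold expands with the position i, rare large symbols are escaped early on but absorbed into the main stream once the threshold has grown past them, which is the intuition behind censoring against a heavy-tailed envelope.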
Moment inequalities for functions of independent random variables
A general method for obtaining moment inequalities for functions of
independent random variables is presented. It is a generalization of the
entropy method which has been used to derive concentration inequalities for
such functions [Boucheron, Lugosi and Massart Ann. Probab. 31 (2003)
1583-1614], and is based on a generalized tensorization inequality due to
Latala and Oleszkiewicz [Lecture Notes in Math. 1745 (2000) 147-168]. The new
inequalities prove to be a versatile tool in a wide range of applications. We
illustrate the power of the method by showing how it can be used to
effortlessly re-derive classical inequalities including Rosenthal and
Kahane-Khinchine-type inequalities for sums of independent random variables,
moment inequalities for suprema of empirical processes and moment inequalities
for Rademacher chaos and U-statistics. Some of these corollaries are apparently
new. In particular, we generalize Talagrand's exponential inequality for
Rademacher chaos of order 2 to any order. We also discuss applications for
other complex functions of independent random variables, such as suprema of
Boolean polynomials which include, as special cases, subgraph counting problems
in random graphs.
Published at http://dx.doi.org/10.1214/009117904000000856 in the
Annals of Probability (http://www.imstat.org/aop/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
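The Efron-Stein inequality that underlies the entropy method can be checked numerically. The sketch below (Python, illustrative) estimates both sides of Var f(X) <= (1/2) * sum_i E[(f(X) - f(X^(i)))^2] for f = max of n i.i.d. Uniform(0,1) variables, where X^(i) resamples the i-th coordinate independently:

```python
import random

def efron_stein_demo(n=5, reps=20000, seed=0):
    """Monte Carlo comparison of Var f(X) with the Efron-Stein bound
    (1/2) * sum_i E[(f(X) - f(X^(i)))^2] for f = max of n uniforms."""
    rng = random.Random(seed)
    vals, es = [], 0.0
    for _ in range(reps):
        x = [rng.random() for _ in range(n)]
        fx = max(x)
        vals.append(fx)
        for i in range(n):
            y = x[:]
            y[i] = rng.random()  # resample the i-th coordinate
            es += 0.5 * (fx - max(y)) ** 2
    mean = sum(vals) / reps
    var = sum((v - mean) ** 2 for v in vals) / reps
    return var, es / reps

var, bound = efron_stein_demo()
```

For this f the bound holds with some slack (the true variance is n/((n+1)^2(n+2))), which is consistent with the inequality being an upper bound rather than an identity outside the additive case.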
Adaptive compression against a countable alphabet
This paper sheds light on universal coding with respect to classes of memoryless sources over a countable alphabet defined by an envelope function with finite and non-decreasing hazard rate. We prove that the auto-censuring (AC) code introduced by Bontemps (2011) is adaptive with respect to the collection of such classes. The analysis builds on the tight characterization of universal redundancy rates in terms of metric entropy by Haussler and Opper (1997) and on a careful analysis of the performance of the AC-coding algorithm. The latter relies on non-asymptotic bounds for maxima of samples from discrete distributions with finite and non-decreasing hazard rate.
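The hazard rate condition can be made concrete: for a discrete distribution p on {1, 2, ...}, the hazard rate is h(k) = p(k) / P(X >= k), and the geometric distribution has constant hazard. A small sketch (Python; `hazard_rates` is an illustrative helper) computes the hazard rates of a truncated geometric pmf and checks that they are non-decreasing:

```python
def hazard_rates(pmf):
    """Discrete hazard rates h(k) = p(k) / P(X >= k) for a pmf given as a
    list p[0], p[1], ... (support truncated to the list's length)."""
    tail = sum(pmf)
    hs = []
    for p in pmf:
        hs.append(p / tail)
        tail -= p
    return hs

# Geometric(q) has constant hazard rate q; truncating the support makes
# the last few values drift upward, so the sequence stays non-decreasing.
q = 0.3
geom = [q * (1 - q) ** k for k in range(20)]
hs = hazard_rates(geom)
```

Envelopes with non-decreasing hazard rate, as in the class above, are exactly the regime in which the AC code's maxima bounds apply.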
Apprentissage et calculs (Learning and computation)
Model Selection and Error Estimation
We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalization: any good error estimate may be converted into a data-based penalty function, and the performance of the estimate is governed by the quality of the error estimate. We consider several penalty functions, involving error estimates on independent test data, empirical VC dimension, empirical VC entropy, and margin-based quantities. We also consider the maximal difference between the error on the first half of the training data and the second half, and the expected maximal discrepancy, a closely related capacity estimate that can be calculated by Monte Carlo integration. Maximal discrepancy penalty functions are appealing for pattern classification problems, since their computation is equivalent to empirical risk minimization over the training data with some labels flipped.
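The maximal discrepancy penalty can be computed exactly as described: split the sample in half and maximize, over the classifier class, the difference between the empirical errors on the two halves. A sketch for 1-D threshold classifiers h_t(x) = 1[x >= t] (Python; the function name and the brute-force scan over thresholds are our illustrative choices):

```python
def max_discrepancy(xs, ys):
    """Maximal discrepancy for 1-D threshold classifiers h_t(x) = 1[x >= t],
    including both label orientations: the maximum over t of the error on
    the first half minus the error on the second half."""
    n = len(xs) // 2
    pairs = list(zip(xs, ys))
    first, second = pairs[:n], pairs[n:]
    thresholds = sorted(set(xs)) + [float("inf")]
    best = 0.0
    for t in thresholds:
        for flip in (0, 1):
            def err(half):
                return sum((x >= t) != (y ^ flip) for x, y in half) / len(half)
            best = max(best, err(first) - err(second))
    return best

# Halves with opposite constant labels can be perfectly discriminated,
# so the discrepancy is 1.0.
d = max_discrepancy([1, 2, 3, 4], [1, 1, 0, 0])
```

As the abstract notes, this computation is an empirical risk minimization with the second half's labels flipped, so any ERM routine for the class can evaluate the penalty.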
Pattern Coding Meets Censoring: (almost) Adaptive Coding on Countable Alphabets
Adaptive coding faces the following problem: given a collection of source classes such that each class in the collection has a non-trivial minimax redundancy rate, can we design a single code which is asymptotically minimax over each class in the collection? In particular, adaptive coding makes sense when there is no universal code on the union of classes in the collection. In this paper, we deal with classes of sources over an infinite alphabet that are characterized by a dominating envelope. We provide asymptotic equivalents for the redundancy of envelope classes enjoying a regular variation property. We finally construct a computationally efficient online prefix code, which interleaves the encoding of the so-called pattern of the message and the encoding of the dictionary of discovered symbols. This code is shown to be adaptive, within a log log n factor, over the collection of regularly varying envelope classes. The code is both simpler and less redundant than previously described contenders. In contrast with previous attempts, it also covers the full range of slowly varying envelope classes.
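The "pattern" of a message mentioned above replaces each symbol by the rank of its first occurrence, so that the dictionary of discovered symbols can be encoded separately; for example, "abracadabra" has pattern 1 2 3 1 4 1 5 1 2 3 1. A minimal sketch (Python):

```python
def pattern(msg):
    """Pattern of a message: each symbol is replaced by the order in
    which it was first discovered; the actual symbols (the dictionary)
    are carried by a separate stream, as in pattern coding."""
    ranks, out = {}, []
    for s in msg:
        if s not in ranks:
            ranks[s] = len(ranks) + 1
        out.append(ranks[s])
    return out
```

Interleaving the pattern stream with the dictionary stream is what lets the code above remain a prefix code while deferring the costly symbol identities.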